* Stata do-file, Robustness tests for Chapter 3, Mark R. Beissinger, 
*    The Revolutionary City  
version 14
clear all
set more off
log using Robustnesstestfiles\Logfiles\robustnesstestschapter3.log, replace
* ============================================================================
* ROBUSTNESS CHECKS FOR STATISTICAL RESULTS APPEARING IN CHAPTER 3
* STATA Do file for Chapter 3 
* Robustness checks for results reported in Chapter 3  
* Author: Mark R. Beissinger  
* Date:  January 2022  
* Princeton, NJ 
* =============================================================================
* BEFORE RUNNING, YOU MUST SET THE DEFAULT PATH FOR WHERE THE DATA
*   FILES RESIDE
* ============================================================================
* The following datafile is used in this file:
*   Panel data for revolutionary episodes--revspredictbycntryyr.dta
* ============================================================================
* Before running, download the following packages for STATA:
* firthlogit from http://fmwww.bc.edu/RePEc/bocode/f
* relogit from https://gking.harvard.edu/relogit
* checkrob from http://fmwww.bc.edu/RePEc/bocode/c
* qic (as discussed in https://www.stata-journal.com/article.html?article=st0126)
*	--install from http://www.stata-journal.com/software/sj8-1/
* =============================================================================
* The following output is produced by these robustness tests:
* 		Robustnesstestfiles\Logfiles\robustnesstestschapter3.log
*
*		In addition, the following graphs of imputed vs. observed
*			observations were produced:
*				Robustnesstestfiles\Logfiles\mod4impobsunder5mortl.pdf
*				Robustnesstestfiles\Logfiles\mod4impobspercurbanl.pdf
*				Robustnesstestfiles\Logfiles\mod4impobstotalyrsschooll.pdf
*				Robustnesstestfiles\Logfiles\mod4impobsmilperthousl.pdf
*				Robustnesstestfiles\Logfiles\mod4impobsyouthpercl.pdf
*				Robustnesstestfiles\Logfiles\mod4impobstottradepernomgdpl.pdf
*				Robustnesstestfiles\Logfiles\mod4impobslndollexchratel.pdf
*				Robustnesstestfiles\Logfiles\mod4impobsageincleader.pdf
*				Robustnesstestfiles\Logfiles\mod4impobslnmilexppersoldthl.pdf
*	These files have been combined into a single pdf version of the
*			output file, located in the Robustnesstestfiles\Outputfiles folder
*	In addition, the reworked output from the checkrob procedure run in this
*	   chapter can be viewed in the Excel file checkrob.results.chapter3.xlsx,
*		also located in the Robustnesstestfiles\Outputfiles folder	
* =============================================================================
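
* -----------------------------------------------------------------------------
* Optional pre-flight check (a sketch, not part of the original file): confirms
*   that the user-written commands listed above can be found on the ado-path
*   and that the data file is visible from the current working directory.
*   It only prints messages and changes nothing; installation is left to the
*   user, from the sources given in the header.
foreach pkg in firthlogit relogit checkrob qic {
	capture which `pkg'
	if _rc {
		display as error "`pkg' not found: install it from the source listed in the header"
	}
}
capture confirm file "revspredictbycntryyr.dta"
if _rc {
	display as error "revspredictbycntryyr.dta not found: set the working directory with cd first"
}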

use revspredictbycntryyr.dta

* =====================================================================================
* ROBUSTNESS CHECKS: CHECKING QUADRATURE FOR BIVARIATE REGRESSIONS (FIGURES 3.1 TO 3.4)
* =====================================================================================
* Figure 3.1. Polity score
* Urban civic
xtcloglog urbancivicny  polityl c.polityl#c.polityl c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
* Quadrature test
quadchk, nooutput
* --Passed:  all coefficients change by less than .01 
* Social
xtcloglog leftistny polityl c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
* Quadrature test
quadchk, nooutput
* --Passed:  all coefficients change by less than .01 
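* Sketch (not part of the original tests): a by-hand version of the idea behind
*   quadchk, shown for the model just above. It assumes the xtcloglog default of
*   12 integration points, which quadchk by default compares against refits
*   with roughly two-thirds and four-thirds as many points (8 and 16).
*   mreldif() returns the largest |x-y|/(|y|+1) across coefficients, which is
*   close to, but not identical to, the relative differences quadchk reports.
quietly xtcloglog leftistny polityl c.time1##c.time1##c.time1 if indstate==1, intpoints(8) nolog
matrix b8 = e(b)
quietly xtcloglog leftistny polityl c.time1##c.time1##c.time1 if indstate==1, intpoints(16) nolog
matrix b16 = e(b)
display "Largest relative change in coefficients, 8 vs. 16 points: " mreldif(b8, b16)
matrix drop b8 b16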

* On relationship of non-democratic regime-types to probability of onset for urban civic and social revolutionary episodes (Geddes data)
* Urban civic
xtcloglog urbancivicny  gedpartyautoc gedmilautoc gedmonautoc gedpersautoc c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
* Quadrature test
quadchk, nooutput
* --Passed:  all coefficients change by less than .01 
* Social
xtcloglog leftistny  gedpartyautoc gedmilautoc gedmonautoc gedpersautoc c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
* Quadrature test
quadchk, nooutput
* --Passed:  all coefficients change by less than .01 

* Figure 3.2. Years incumbent leader in power
* Statistically significant relationship between yrsincleaderinpower and urban civic episodes
* No statistically significant relationship between yrsincleaderinpower and social revolutionary episodes (various polynomial forms tested)
* Urban civic
xtcloglog urbancivicny yrsincleaderinpower c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
* Quadrature test
quadchk, nooutput
* --Passed:  all coefficients change by less than .01 
* Social
xtcloglog leftistny yrsincleaderinpower c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
* Quadrature test
quadchk, nooutput
* --Passed:  all coefficients change by less than .01 
* Testing polynomial forms
xtcloglog leftistny c.yrsincleaderinpower##c.yrsincleaderinpower c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
* Not statistically significant
xtcloglog leftistny c.yrsincleaderinpower##c.yrsincleaderinpower##c.yrsincleaderinpower c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
* Not statistically significant

* V-Dem executive corruption measure
* Urban civic
xtcloglog urbancivicny v2x_execorr c.time1##c.time1##c.time1 if indstate==1,  eform nolog vce(robust)
* Quadrature test
quadchk, nooutput
* --Passed:  all coefficients change by less than .01 
* Social
xtcloglog leftistny v2x_execorr c.time1##c.time1##c.time1 if indstate==1,  eform nolog vce(robust)
* Quadrature test
quadchk, nooutput
* --Passed:  all coefficients change by less than .01 

* Figure 3.3. GDP per capita
* Urban civic 
* Linear vs. Quadratic specification: quadratic is better (lower BIC and AIC)
xtcloglog urbancivicny c.gdppcthl c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
estat ic
xtcloglog urbancivicny c.gdppcthl##c.gdppcthl c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
estat ic
* Quadrature test on quadratic
quietly: xtcloglog urbancivicny c.gdppcthl##c.gdppcthl c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
quadchk, nooutput
* --Passed:  all coefficients change by less than .01 
* Social
* Linear vs. quadratic specification: linear is better (lower BIC and AIC; quadratic not significant)
xtcloglog leftistny gdppcthl c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
estat ic
xtcloglog leftistny c.gdppcthl##c.gdppcthl c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
estat ic
* Quadrature test on linear
quietly: xtcloglog leftistny gdppcthl c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
quadchk, nooutput
* --Passed:  all coefficients change by less than .01 
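* Sketch (not in the original file): the linear and quadratic specifications
*   can also be compared in a single AIC/BIC table by storing each fit. The
*   names lin and quad are illustrative; both fits use the same variables and
*   therefore the same sample.
quietly xtcloglog leftistny gdppcthl c.time1##c.time1##c.time1 if indstate==1, vce(robust) nolog
estimates store lin
quietly xtcloglog leftistny c.gdppcthl##c.gdppcthl c.time1##c.time1##c.time1 if indstate==1, vce(robust) nolog
estimates store quad
estimates stats lin quad
estimates drop lin quad
capture drop _est_lin _est_quad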

* Figure 3.4. Economic growth
* Urban civic
xtcloglog urbancivicny gdppcgrow1yrl c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
* Quadrature test
quadchk, nooutput
* --Passed:  all coefficients change by less than .01 
* Social
xtcloglog leftistny gdppcgrow1yrl c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
* Quadrature test
quadchk, nooutput
* --Passed:  all coefficients change by less than .01 

* Absence of relationship of economic growth to urban civic revolution in upper-income countries
* 	More consistent relationship among lower middle-income countries
xtcloglog urbancivicny i.gdppcquartersl##c.gdppcgrow1yrl c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
* Quadrature test
quadchk, nooutput
* --Passed:  all coefficients change by less than .01 

* Lack of bivariate relationship between oil production and revolution (either social or urban civic)
* Urban civic
xtcloglog urbancivicny lnoill c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
* Quadrature test
quadchk, nooutput
* --Passed:  all coefficients change by less than .01 
* Social
xtcloglog leftistny lnoill c.time1##c.time1##c.time1 if indstate==1, vce(robust) eform nolog
* Quadrature test
quadchk, nooutput
* --Passed:  all coefficients change by less than .01 


* =============================================================================
* ROBUSTNESS TESTS FOR URBAN CIVIC REVOLUTIONARY EPISODES MODEL  (TABLE 3.1)
* =============================================================================
* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
* Robustness test for possible issues of multicollinearity
* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
* Checking for possible multicollinearity using variance inflation factors, with variables in Model 4 in Table 3.1
quietly: reg urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar if indstate==1
estat vif
* RESULT:  GDP per capita and its quadratic term are potentially problematic,
* 	as their VIFs approach or exceed 10
*  To test whether multicollinearity affected the results, as recommended
*	in some sources, I centered the variable and re-ran the regression to see 
*	if centering changed any parameters of the model
sum gdppcthl if indstate==1
local gdppcmean = r(mean)
generate centeredgdppcthl= gdppcthl - `gdppcmean' if indstate==1
generate centeredgdppcthl2 = centeredgdppcthl * centeredgdppcthl if indstate==1
quietly: reg urbancivicny lnpopl centeredgdppcthl centeredgdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar if indstate==1
estat vif
* New gdppcthl variables now have variance inflation factor within reasonable limits
* Then checked to see if it made a difference in the regression outcomes
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 gdppcgrow1yrl polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar if indstate==1, eform nolog
estimates store mod1
xtcloglog urbancivicny lnpopl centeredgdppcthl centeredgdppcthl2 gdppcgrow1yrl polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar if indstate==1, eform nolog
estimates store mod2
* RESULT: No difference in patterns of statistical significance on visual inspection
* Hausman test of the two models
hausman mod1 mod2
* RESULT:  No difference in the regression coefficients of the two models: can 
*	safely keep gdppcthl and gdppcthl2 in the specification
drop _est_mod1 _est_mod2
macro drop _all
drop centeredgdppcthl centeredgdppcthl2
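* Sketch (not in the original file): why centering lowers the VIF. A variable
*   and its square are usually highly correlated; after subtracting the mean,
*   that correlation typically falls. This assumes gdppcthl2 is the square of
*   gdppcthl. The temporary variables are dropped again at the end.
quietly summarize gdppcthl if indstate==1
tempvar cgdp cgdp2
generate `cgdp' = gdppcthl - r(mean) if indstate==1
generate `cgdp2' = `cgdp'^2
correlate gdppcthl gdppcthl2 if indstate==1
correlate `cgdp' `cgdp2' if indstate==1
drop `cgdp' `cgdp2'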

* +++++++++++++++++++++++++++
* Bootstrap country-clusters 
* +++++++++++++++++++++++++++
* WARNING:  CAN TAKE A WHILE TO COMPUTE
* Must use pooled model, with clustered standard errors
* Warning:  will be reloading dataset after this command
clear
use revspredictbycntryyr, clear
xtset, clear
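* Sketch (not in the original file): the bootstrap below draws random
*   resamples, so its results vary from run to run unless a seed is set. The
*   value used here is arbitrary.
set seed 12345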
bootstrap , reps(1000) cluster(cowcode) idcluster(newid) : cloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar if indstate==1, vce(cluster cowcode) eform
estat bootstrap, eform all
*  Result:  no changes in signs or patterns of statistical significance
clear
use revspredictbycntryyr.dta

* ++++++++++++++++++++++++++++
* Other estimation techniques
* ++++++++++++++++++++++++++++
* Rare event framework for Model 4
* The probability of an urban civic revolt across all cases in the sample (.0048, or roughly .005) is taken as the pc() parameter
relogit urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar if indstate==1, cluster(cowcode) pc(.0048)
*	RESULT: all findings remain unchanged
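* Sketch (not in the original file): one way to obtain a value for pc() is the
*   raw share of country-years with an urban civic onset. This assumes
*   urbancivicny is coded 0/1; the share in the estimation sample can differ
*   slightly because of listwise deletion.
quietly summarize urbancivicny if indstate==1
display "Share of country-years with an urban civic onset: " %6.4f r(mean)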

* Population-averaged complementary log-log panel framework with different correlation structures for Model 4
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar time1 timesq timecub if  indstate==1, pa corr(exc) eform nolog vce(robust)
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar time1 timesq timecub if  indstate==1, pa corr(ar1) eform nolog vce(robust)
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar time1 timesq timecub if  indstate==1, pa corr(ind) eform nolog vce(robust)
*	RESULT:  all findings remain unchanged
* QIC (quasi-likelihood under the independence model criterion) test for model selection for a population-averaged model (Model 4)
*	For testing which correlation structure is best
* 	The correlation structure with the smallest QIC is the preferred correlation structure
*	See Cui, James. 2007. "QIC Program and Model Selection in GEE Analyses," The Stata Journal 7, 2: 209-220.
* Generate common sample
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar time1 timesq timecub if indstate==1, pa corr(ar1) vce(robust) eform nolog
generate sample=e(sample)
* Testing ar1 on common sample
qic urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar time1 timesq timecub if indstate==1 & sample,  corr(ar1) link(cloglog) family(binomial) robust eform nolog
* Testing exc on common sample
qic urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar time1 timesq timecub if indstate==1 & sample,  corr(exc) link(cloglog) family(binomial) robust eform nolog
* Testing ind on common sample
qic urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar time1 timesq timecub if indstate==1 & sample,  corr(ind) link(cloglog) family(binomial) robust eform nolog
* an exchangeable (equal correlation) structure proved the most efficient (lowest qic)
drop sample

* +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
* Testing robustness of specification to inclusion or exclusion of variables
* +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
* checkrob procedure using Model 4
* Testing if signs are stable and z-statistics for Beta/S.E. of Beta are >=1.96 (i.e., .05 level of significance)
*	or >=1.65 (i.e., .10 levels of significance)
*   z-statistics were calculated in Excel (as comma-separated tables) from the output produced by checkrob 
* BE AWARE that running checkrob can take a considerable amount of time
* There is also sometimes a glitch in the checkrob procedure
*    In the tables produced, the results for the first variable of a quadratic specification sometimes falsely 
*       include some results from the squared variable    
*    MUST VISUALLY INSPECT AND, IF PRESENT, HAND-CORRECT AND RECALCULATE RESULTS FOR THOSE VARIABLES
*    (This was done for gdppcthl and gdppcthl2 and polityl and polityl2 below)
* +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
** Model 4 (testing additional elements)
* postcoldwar kept in all specifications as time control
* I HAVE PROVIDED THE EXCEL TABLE WITH RESULTS OF THE TEST BELOW, CORRECTED FOR THE ABOVE PROBLEMS
* THE RESULTS ARE PRESENTED WITH OTHER ROBUSTNESS TESTS FILES IN THE FILE checkrob.results.chapter3.xlsx
* COMMAND USED: checkrob 1 8 ch3tab1mod4.txt: xtcloglog urbancivicny postcoldwar lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill if indstate==1
* RESULTS:  All variables had stable signs and were statistically significant, with exceptions of v2x_execorr and lnoill
* 		v2x_execorr:  no change of signs, but significant at the .05 level in only 75 percent of specifications (problematic when gdppcthl is dropped)
*		lnoill:  change of signs in 12.5% of specifications (when gdppcthl is not included)
*				 significant at the .05 or .10 levels only in 40.5% of specifications (insignificant when either gdppcthl or polityl are dropped)
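* Sketch of the z-statistic rule described above (not part of the original
*   file): z is the coefficient divided by its standard error, compared with
*   the 1.96 and 1.65 cutoffs. It is shown here for the full Model 4 only; the
*   author computed the same ratio in Excel for the specifications produced by
*   checkrob. A printed 1 means the cutoff is met, 0 that it is not.
quietly xtcloglog urbancivicny postcoldwar lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill if indstate==1, nolog
foreach v in v2x_execorr lnoill {
	local z = _b[`v']/_se[`v']
	display "`v': z = " %6.2f `z' "   |z|>=1.96: " (abs(`z')>=1.96) "   |z|>=1.65: " (abs(`z')>=1.65)
}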

* Follow-up test: Likelihood ratio test for whether inclusion of oil production improves the accuracy of the model
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar if indstate==1, eform nolog
generate sample=e(sample)
estimates store mod1
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr  postcoldwar if indstate==1 & sample==1, eform nolog
estimates store mod2
lrtest mod1 mod2
drop _est_mod1 _est_mod2
* RESULT:  Significant improvement of the model when lnoill is included

* ++++++++++++++++++++++++++++++++++++++++++++
* Tests for omitted variable bias in Model 4
* ++++++++++++++++++++++++++++++++++++++++++++
* Logged military spending per soldier: lnmilexppersoldthl
* Bivariate, controlling for time
xtcloglog urbancivicny lnmilexppersoldthl time1 timesq timecub if indstate==1, vce(robust) eform nolog
* Result:  negative and statistically significant at the .05 level in bivariate relationship, controlling for time
* Multivariate cloglog panel, Model 4 from Table 3.1
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar lnmilexppersoldthl if indstate==1, vce(robust) eform nolog
quadchk, nooutput
* Failed quadchk--recalculate using pooled model
cloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar lnmilexppersoldthl if indstate==1, vce(robust) eform nolog
*  Result:  
*    --lnmilexppersoldthl statistically insignificant
*    --no sign changes or change in statistical significance of all other variables

* Military soldiers per 1000 pop:  milperthousl
* Bivariate, controlling for time
xtcloglog urbancivicny milperthousl time1 timesq timecub if indstate==1, vce(robust) eform nolog
* Result:  positive and statistically insignificant in bivariate relationship, controlling for time
* Multivariate cloglog panel, Model 4 from Table 3.1
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar milperthousl if indstate==1, vce(robust) eform nolog
quadchk, nooutput
* Failed quadchk--recalculate using pooled model
cloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar milperthousl if indstate==1, vce(robust) eform nolog
*  Result  
*    --milperthousl marginally statistically significant at the .10 level
*    --no sign change or change in statistical significance of all other variables

* Mean years of schooling: totalyrsschooll
* Bivariate, controlling for time
xtcloglog urbancivicny totalyrsschooll time1 timesq timecub if indstate==1, vce(robust) eform nolog
* Result:  negative and statistically insignificant in bivariate relationship, controlling for time
* Multivariate cloglog panel, Model 4 from Table 3.1
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar totalyrsschooll if indstate==1, vce(robust) eform nolog
quadchk, nooutput
* Failed quadchk--recalculate using pooled model
cloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar totalyrsschooll if indstate==1, vce(robust) eform nolog
*  Result  
*    --sample is highly reduced since data for totalyrsschooll only exist since 1951 (n=6,853)
*    --totalyrsschooll is statistically insignificant 
*    --no sign change or change in statistical significance for all other variables

* Levels of urbanization:  percurbanl
* Bivariate, controlling for time
xtcloglog urbancivicny percurbanl time1 timesq timecub if indstate==1, vce(robust) eform nolog
* Result:  statistically insignificant in bivariate relationship, controlling for time
* Multivariate cloglog panel, Model 4 from Table 3.1
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar percurbanl if indstate==1, vce(robust) eform nolog
quadchk, nooutput
* Failed quadchk--recalculate using pooled model
cloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar percurbanl if indstate==1, vce(robust) eform nolog
*  Result  
*    --percurbanl is statistically insignificant
*    --no sign change or change in statistical significance for all other variables
* Urbanization is significant when gdppcthl is dropped due to high correlation between the two (r=.68)
cloglog urbancivicny lnpopl  polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar percurbanl if indstate==1, vce(robust) eform nolog

* Population density: lnpopdensityl
* Bivariate, controlling for time
xtcloglog urbancivicny lnpopdensityl time1 timesq timecub if indstate==1, vce(robust) eform nolog
* Result:  positive and statistically significant in bivariate relationship, controlling for time
* Multivariate cloglog panel, Model 4 from Table 3.1
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar lnpopdensityl if indstate==1, vce(robust) eform nolog
quadchk, nooutput
* Failed quadchk--recalculate using pooled model
cloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar lnpopdensityl if indstate==1, vce(robust) eform nolog
*  Result 
*    --lnpopdensityl is statistically insignificant 
*    --no sign change or change in statistical significance for all other variables

* Youth as proportion of population: youthpercl
* Bivariate, controlling for time
xtcloglog urbancivicny youthpercl time1 timesq timecub if indstate==1, vce(robust) eform nolog
* Result:  positive and statistically insignificant in bivariate relationship, controlling for time
* Multivariate cloglog panel, Model 4 from Table 3.1
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar youthpercl if indstate==1, vce(robust) eform nolog
quadchk, nooutput
* Failed quadchk--recalculate using pooled model
cloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar youthpercl if indstate==1, vce(robust) eform nolog
*  Result 
*    --sample is reduced because data on youthpercl only begins in 1951 (n=8,111)
*    --youthpercl is negative (opposite of what theory would say) and marginally significant at the .10 level
*    --no sign change or change in statistical significance for all other variables

* Trade as percent of GDP: tottradepernomgdpl 
* Bivariate, controlling for time
xtcloglog urbancivicny tottradepernomgdpl time1 timesq timecub if indstate==1, vce(robust) eform nolog
* Result:  negative and statistically insignificant  in bivariate relationship, controlling for time
* Multivariate cloglog panel, Model 4 from Table 3.1
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar tottradepernomgdpl if indstate==1, vce(robust) eform nolog
quadchk, nooutput
*  Result 
*    --sample is reduced because of missing data on tottradepernomgdpl (n=9,310), with only 43 urban civic episodes and 150 countries
*    --tottradepernomgdpl is statistically insignificant 
*    --yrsincleaderinpower and v2x_execorr grow marginally significant at .10 level
*    --no sign change or change in statistical significance for all other variables

* Logged dollar exchange rate:  lndollexchratel 
* Bivariate, controlling for time
xtcloglog urbancivicny lndollexchratel time1 timesq timecub if indstate==1, vce(robust) eform nolog
* Result:  positive and statistically insignificant in bivariate relationship, controlling for time
* Multivariate cloglog panel, Model 4 from Table 3.1
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar lndollexchratel if indstate==1, vce(robust) eform nolog
quadchk, nooutput
*  Result 
*    --sample is highly reduced (n=8,455)
*    --lndollexchratel is statistically insignificant
*    --polity grows marginally significant at the .10 level, and polityl2 becomes statistically insignificant
*    --no sign change or change in statistical significance for all other variables

* Presence of financial crisis: rrfinstressl
* Bivariate, controlling for time
xtcloglog urbancivicny rrfinstressl time1 timesq timecub if indstate==1, vce(robust) eform nolog
* Result:  positive and statistically insignificant in bivariate relationship, controlling for time
* Multivariate cloglog panel, Model 4 from Table 3.1
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar rrfinstressl if indstate==1, vce(robust) eform nolog
quadchk, nooutput
* Does not pass quadchk--recalculate using pooled model
cloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar rrfinstressl if indstate==1, vce(robust) eform nolog
*  Result 
*    --sample is highly reduced because of missing data on rrfinstressl (n=5,887) with 19 urban civic episodes and 67 countries
*    --rrfinstressl is statistically insignificant
*    --polityl2 and yrsincleaderinpower grow statistically insignificant
*    --no sign change or change in statistical significance for all other variables

*  Child mortality:  under5mortl
* Bivariate, controlling for time
xtcloglog urbancivicny under5mortl time1 timesq timecub if indstate==1, vce(robust) eform nolog
* Result:  negative and statistically insignificant in bivariate relationship, controlling for time
* Multivariate cloglog panel, Model 4 from Table 3.1
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar under5mortl if indstate==1, vce(robust) eform nolog
quadchk, nooutput
* Does not pass quadchk--recalculate using pooled model
cloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar under5mortl if indstate==1, vce(robust) eform nolog
*  Result 
*    --under5mortl is negative and marginally significant at the .10 level
*    --no sign change or change in statistical significance for all other variables

* Age of incumbent leader: ageincleader
* Bivariate, controlling for time
xtcloglog urbancivicny ageincleader time1 timesq timecub if indstate==1, vce(robust) eform nolog
* Result:  positive and statistically significant in bivariate relationship, controlling for time
* Multivariate cloglog panel, Model 4 from Table 3.1
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar ageincleader if indstate==1, vce(robust) eform nolog
quadchk, nooutput
*  Result 
*    --ageincleader is statistically insignificant 
*    --yrsincleaderinpower grows statistically insignificant 
*    --no sign change or change in statistical significance for all other variables

* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
* Further testing for omitted variable bias in Multiple Imputation Model 4
* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
* Under 5 child mortality--under5mortl
* On 10 imputed datasets, testing Model 4 from Table 3.1
clear
use revspredictbycntryyr, clear
drop if indstate==0
misstable sum under5mortl, gen(miss)
mi set wide
mi register regular lnpopl postcoldwar cowcode year urbancivicny 
mi register imputed polityl gdppcthl  yrsincleaderinpower v2x_execorr lnoill under5mortl
mi xtset cowcode year
mi impute chained (pmm, knn(10)) polityl gdppcthl yrsincleaderinpower v2x_execorr lnoill under5mortl = lnpopl postcoldwar, add(10) rseed(1234) force dots chaindots
qui mi xeq 1: twoway (kdensity under5mortl if missunder5mortl==0) || (kdensity under5mortl if missunder5mortl==1) || (kdensity under5mortl), legend(label(1 "Observed") label(2 "Imputed") label(3 "Completed"))
graph export Robustnesstestfiles\Logfiles\mod4impobsunder5mortl.pdf, replace
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny lnpopl c.gdppcthl##c.gdppcthl c.polityl##c.polityl yrsincleaderinpower v2x_execorr lnoill postcoldwar under5mortl, vce(robust)
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny under5mortl time1 timesq timecub, vce(robust)
*	RESULTS:  under5mortl--negative and marginally significant at the .10 level, did not change significance or signs of other variables 
*		  no bivariate relationship

* Percent urban--percurbanl
* On 10 imputed datasets, testing Model 4 from Table 3.1
clear
use revspredictbycntryyr, clear
drop if indstate==0
misstable sum percurbanl, gen(miss)
mi set wide
mi register regular lnpopl postcoldwar cowcode year urbancivicny 
mi register imputed polityl gdppcthl  yrsincleaderinpower v2x_execorr lnoill percurbanl
mi xtset cowcode year
mi impute chained (pmm, knn(10)) polityl gdppcthl yrsincleaderinpower v2x_execorr lnoill percurbanl = lnpopl postcoldwar, add(10) rseed(1234) force dots chaindots
qui mi xeq 1: twoway (kdensity percurbanl  if misspercurbanl==0) || (kdensity percurbanl if misspercurbanl==1) || (kdensity percurbanl), legend(label(1 "Observed") label(2 "Imputed") label(3 "Completed"))
graph export Robustnesstestfiles\Logfiles\mod4impobspercurbanl.pdf, replace
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny lnpopl c.gdppcthl##c.gdppcthl c.polityl##c.polityl yrsincleaderinpower v2x_execorr lnoill postcoldwar percurbanl, vce(robust)
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny percurbanl time1 timesq timecub, vce(robust)
*	RESULTS:  percurbanl--positive and statistically insignificant, did not change significance or signs of other variables
*		  no bivariate relationship

* Average years of schooling--totalyrsschooll
* On 10 imputed datasets, testing Model 4 from Table 3.1
clear
use revspredictbycntryyr, clear
drop if indstate==0
misstable sum totalyrsschooll, gen(miss)
mi set wide
mi register regular lnpopl postcoldwar cowcode year urbancivicny 
mi register imputed polityl gdppcthl  yrsincleaderinpower v2x_execorr lnoill totalyrsschooll
mi xtset cowcode year
mi impute chained (pmm, knn(10)) polityl gdppcthl yrsincleaderinpower v2x_execorr lnoill totalyrsschooll = lnpopl postcoldwar, add(10) rseed(1234) force dots chaindots
qui mi xeq 1: twoway (kdensity totalyrsschooll if misstotalyrsschooll==0) || (kdensity totalyrsschooll if misstotalyrsschooll==1) || (kdensity totalyrsschooll), legend(label(1 "Observed") label(2 "Imputed") label(3 "Completed"))
graph export Robustnesstestfiles\Logfiles\mod4impobstotalyrsschooll.pdf, replace
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny lnpopl c.gdppcthl##c.gdppcthl c.polityl##c.polityl yrsincleaderinpower v2x_execorr lnoill postcoldwar totalyrsschooll, vce(robust)
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny totalyrsschooll time1 timesq timecub, vce(robust)
*	RESULTS:  totalyrsschooll--positive and statistically insignificant, did not change significance or signs of other variables
*		  bivariate relationship is negative and insignificant

* Military personnel per thousand pop--milperthousl
* On 10 imputed datasets, testing Model 4 from Table 3.1
clear
use revspredictbycntryyr, clear
drop if indstate==0
misstable sum milperthousl, gen(miss)
mi set wide
mi register regular lnpopl postcoldwar cowcode year urbancivicny 
mi register imputed polityl gdppcthl  yrsincleaderinpower v2x_execorr lnoill milperthousl
mi xtset cowcode year
mi impute chained (pmm, knn(10)) polityl gdppcthl yrsincleaderinpower v2x_execorr lnoill milperthousl = lnpopl postcoldwar, add(10) rseed(1234) force dots chaindots
qui mi xeq 1: twoway (kdensity milperthousl if missmilperthousl==0) || (kdensity milperthousl if missmilperthousl==1) || (kdensity milperthousl), legend(label(1 "Observed") label(2 "Imputed") label(3 "Completed"))
graph export Robustnesstestfiles\Logfiles\mod4impobsmilperthousl.pdf, replace
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny lnpopl c.gdppcthl##c.gdppcthl c.polityl##c.polityl yrsincleaderinpower v2x_execorr lnoill postcoldwar milperthousl, vce(robust)
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny milperthousl time1 timesq timecub, vce(robust)
*	RESULTS:  milperthousl--negative and statistically insignificant, did not change significance or signs of other variables
*		  no bivariate relationship

* Population density--lnpopdensityl
* lnpopdensityl itself is not imputed (registered as regular); the other Model 4
*   covariates are still imputed on 10 datasets. Testing Model 4 from Table 3.1
clear
use revspredictbycntryyr, clear
drop if indstate==0
mi set wide
mi register regular lnpopl postcoldwar cowcode year urbancivicny lnpopdensityl
mi register imputed polityl gdppcthl  yrsincleaderinpower v2x_execorr lnoill 
mi xtset cowcode year
mi impute chained (pmm, knn(10)) polityl gdppcthl yrsincleaderinpower v2x_execorr lnoill = lnpopl postcoldwar lnpopdensityl, add(10) rseed(1234) force dots chaindots
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny lnpopl c.gdppcthl##c.gdppcthl c.polityl##c.polityl yrsincleaderinpower v2x_execorr lnoill postcoldwar lnpopdensityl, vce(robust)
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny lnpopdensityl time1 timesq timecub, vce(robust)
*	RESULTS:  lnpopdensityl--positive but statistically insignificant; did not change significance or signs of other variables
*		  bivariate relationship significant at the .05 level, controlling for time

* Youth bulge--youthpercl
* On 10 imputed datasets, testing Model 4 from Table 3.1
clear
use revspredictbycntryyr, clear
drop if indstate==0
misstable sum youthpercl, gen(miss)
mi set wide
mi register regular lnpopl postcoldwar cowcode year urbancivicny 
mi register imputed polityl gdppcthl  yrsincleaderinpower v2x_execorr lnoill youthpercl
mi xtset cowcode year
mi impute chained (pmm, knn(10)) polityl gdppcthl yrsincleaderinpower v2x_execorr lnoill youthpercl = lnpopl postcoldwar, add(10) rseed(1234) force dots chaindots
qui mi xeq 1: twoway (kdensity youthpercl if missyouthpercl==0) || (kdensity youthpercl if missyouthpercl==1) || (kdensity youthpercl), legend(label(1 "Observed") label(2 "Imputed") label(3 "Completed"))
graph export Robustnesstestfiles\Logfiles\mod4impobsyouthpercl.pdf, replace
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny lnpopl c.gdppcthl##c.gdppcthl c.polityl##c.polityl yrsincleaderinpower v2x_execorr lnoill postcoldwar youthpercl, vce(robust)
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny youthpercl time1 timesq timecub, vce(robust)
*	RESULTS:  youthpercl--negative and statistically insignificant, did not change significance or signs of other variables
*		  positive and insignificant in the bivariate relationship

* Nominal trade as percent of nominal GDP--tottradepernomgdpl
* On 10 imputed datasets, testing Model 4 from Table 3.1
clear
use revspredictbycntryyr, clear
drop if indstate==0
misstable sum tottradepernomgdpl, gen(miss)
mi set wide
mi register regular lnpopl postcoldwar cowcode year urbancivicny 
mi register imputed polityl gdppcthl  yrsincleaderinpower v2x_execorr lnoill tottradepernomgdpl
mi xtset cowcode year
mi impute chained (pmm, knn(10)) polityl gdppcthl yrsincleaderinpower v2x_execorr lnoill tottradepernomgdpl = lnpopl postcoldwar, add(10) rseed(1234) force dots chaindots
qui mi xeq 1: twoway (kdensity tottradepernomgdpl if misstottradepernomgdpl==0) || (kdensity tottradepernomgdpl if misstottradepernomgdpl==1) || (kdensity tottradepernomgdpl), legend(label(1 "Observed") label(2 "Imputed") label(3 "Completed"))
graph export Robustnesstestfiles\Logfiles\mod4impobstottradepernomgdpl.pdf, replace
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny lnpopl c.gdppcthl##c.gdppcthl c.polityl##c.polityl yrsincleaderinpower v2x_execorr lnoill postcoldwar tottradepernomgdpl, vce(robust)
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny tottradepernomgdpl time1 timesq timecub, vce(robust)
*	RESULTS:  tottradepernomgdpl--negative and statistically insignificant, did not change significance or signs of other variables
*		  no bivariate relationship

* Dollar exchange rate--lndollexchratel
* On 10 imputed datasets, testing Model 4 from Table 3.1
clear
use revspredictbycntryyr, clear
drop if indstate==0
misstable sum lndollexchratel, gen(miss)
mi set wide
mi register regular lnpopl postcoldwar cowcode year urbancivicny 
mi register imputed polityl gdppcthl  yrsincleaderinpower v2x_execorr lnoill lndollexchratel
mi xtset cowcode year
mi impute chained (pmm, knn(10)) polityl gdppcthl yrsincleaderinpower v2x_execorr lnoill lndollexchratel = lnpopl postcoldwar, add(10) rseed(1234) force dots chaindots
qui mi xeq 1: twoway (kdensity lndollexchratel if misslndollexchratel==0) || (kdensity lndollexchratel if misslndollexchratel==1) || (kdensity lndollexchratel), legend(label(1 "Observed") label(2 "Imputed") label(3 "Completed"))
graph export Robustnesstestfiles\Logfiles\mod4impobslndollexchratel.pdf, replace
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny lnpopl c.gdppcthl##c.gdppcthl c.polityl##c.polityl yrsincleaderinpower v2x_execorr lnoill postcoldwar lndollexchratel, vce(robust)
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny lndollexchratel time1 timesq timecub, vce(robust)
*	RESULTS:  lndollexchratel--positive and statistically insignificant, did not change significance or signs of other variables
*		  no bivariate relationship

* Age of incumbent leader--ageincleader
* On 10 imputed datasets, testing Model 4 from Table 3.1
clear
use revspredictbycntryyr, clear
drop if indstate==0
misstable sum ageincleader, gen(miss)
mi set wide
mi register regular lnpopl postcoldwar cowcode year urbancivicny 
mi register imputed polityl gdppcthl  yrsincleaderinpower v2x_execorr lnoill ageincleader
mi xtset cowcode year
mi impute chained (pmm, knn(10)) polityl gdppcthl yrsincleaderinpower v2x_execorr lnoill ageincleader = lnpopl postcoldwar, add(10) rseed(1234) force dots chaindots
qui mi xeq 1: twoway (kdensity ageincleader if missageincleader==0) || (kdensity ageincleader if missageincleader==1) || (kdensity ageincleader), legend(label(1 "Observed") label(2 "Imputed") label(3 "Completed"))
graph export Robustnesstestfiles\Logfiles\mod4impobsageincleader.pdf, replace
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny lnpopl c.gdppcthl##c.gdppcthl c.polityl##c.polityl yrsincleaderinpower v2x_execorr lnoill postcoldwar ageincleader, vce(robust)
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny ageincleader time1 timesq timecub, vce(robust)
*	RESULTS:  ageincleader--positive but statistically insignificant; did not change significance or signs of other variables
*		  bivariate specification--positive and significant at the .001 level

* Logged military expenditures per soldier--lnmilexppersoldthl
* On 10 imputed datasets, testing Model 4 from Table 3.1
clear
use revspredictbycntryyr, clear
drop if indstate==0
misstable sum lnmilexppersoldthl, gen(miss)
mi set wide
mi register regular lnpopl postcoldwar cowcode year urbancivicny 
mi register imputed polityl gdppcthl  yrsincleaderinpower v2x_execorr lnoill lnmilexppersoldthl
mi xtset cowcode year
mi impute chained (pmm, knn(10)) polityl gdppcthl yrsincleaderinpower v2x_execorr lnoill lnmilexppersoldthl = lnpopl postcoldwar, add(10) rseed(1234) force dots chaindots
qui mi xeq 1: twoway (kdensity lnmilexppersoldthl if misslnmilexppersoldthl==0) || (kdensity lnmilexppersoldthl if misslnmilexppersoldthl==1) || (kdensity lnmilexppersoldthl), legend(label(1 "Observed") label(2 "Imputed") label(3 "Completed"))
graph export Robustnesstestfiles\Logfiles\mod4impobslnmilexppersoldthl.pdf, replace
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny lnpopl c.gdppcthl##c.gdppcthl c.polityl##c.polityl yrsincleaderinpower v2x_execorr lnoill postcoldwar lnmilexppersoldthl, vce(robust)
mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny lnmilexppersoldthl time1 timesq timecub, vce(robust)
*	RESULTS:  lnmilexppersoldthl--negative and statistically insignificant, did not change significance or signs of other variables
*		  bivariate relationship is negative and statistically significant at the .01 level
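
* Optional sketch (not part of the original checks): the imputation blocks
*   above all follow one pattern, which could be wrapped in a small helper.
*   The program name mod4impcheck and its argument extravar are hypothetical.
*   The program is only defined here, never called, so the log is unchanged.
*   The observed-vs-imputed density graph is omitted for brevity.
capture program drop mod4impcheck
program define mod4impcheck
	args extravar
	* Reload the panel and keep independent states only
	use revspredictbycntryyr, clear
	drop if indstate==0
	* Flag missing values of the added covariate (creates miss`extravar' if any are missing)
	misstable sum `extravar', gen(miss)
	mi set wide
	mi register regular lnpopl postcoldwar cowcode year urbancivicny
	mi register imputed polityl gdppcthl yrsincleaderinpower v2x_execorr lnoill `extravar'
	mi xtset cowcode year
	* Same chained PMM imputation as above: 10 nearest neighbors, 10 imputed datasets
	mi impute chained (pmm, knn(10)) polityl gdppcthl yrsincleaderinpower v2x_execorr lnoill `extravar' = lnpopl postcoldwar, add(10) rseed(1234) force dots chaindots
	* Model 4 plus the added covariate
	mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny lnpopl c.gdppcthl##c.gdppcthl c.polityl##c.polityl yrsincleaderinpower v2x_execorr lnoill postcoldwar `extravar', vce(robust)
	* Bivariate specification with cubic time controls
	mi estimate, ni(10) post dots eform saving(miest, replace): xtcloglog urbancivicny `extravar' time1 timesq timecub, vce(robust)
end
* Example call (left commented out; it would replace the data in memory):
*   mod4impcheck youthpercl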

* ++++++++++++++
* Further tests
* ++++++++++++++
* Country fixed effects are inappropriate given the rare-event character of the data
* Introduce dummy controls for world regions (North America and Western Europe together form the omitted reference category)
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar africa latam mena formercomm soseasia eastasia if indstate==1, vce(robust) eform nolog
cloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar africa latam mena formercomm soseasia eastasia if indstate==1, vce(robust) eform nolog
firthlogit urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar africa latam mena formercomm soseasia eastasia if indstate==1, or
* Also rotated each region into the regression separately
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar africa if indstate==1, vce(robust) eform nolog
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar latam if indstate==1, vce(robust) eform nolog
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar mena if indstate==1, vce(robust) eform nolog
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar formercomm if indstate==1, vce(robust) eform nolog
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar soseasia if indstate==1, vce(robust) eform nolog
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar eastasia if indstate==1, vce(robust) eform nolog
*  RESULT
*    --none of the regional dummies are statistically significant
*    --no sign change or change in statistical significance for any of the variables
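* Optional sketch (not part of the original checks): a joint Wald test that
*   all six regional dummies are zero in the random-effects model with the
*   full set of dummies. This complements the coefficient-by-coefficient
*   reading reported above. The model is re-estimated quietly so that
*   testparm applies to the full-dummy specification.
qui xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar africa latam mena formercomm soseasia eastasia if indstate==1, vce(robust)
testparm africa latam mena formercomm soseasia eastasia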

* Alternative sample urbancivicaltny with Model 4
*   Includes an additional 11 cases that were quasi-revolutionary but urban civic in character
xtcloglog urbancivicaltny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar if indstate==1, vce(robust) eform nolog
* The random-effects model above failed quadchk (quadrature check), so the pooled cloglog model is used instead
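* Optional sketch (assumption: the original quadchk call is not shown in this
*   file, so this is only one way the check could be run). quadchk refits the
*   most recent random-effects model, here the xtcloglog on urbancivicaltny,
*   with different numbers of quadrature points so the stability of the
*   estimates can be compared. capture noisily keeps the do-file running if
*   the check cannot be performed.
capture noisily quadchk, nooutput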
cloglog urbancivicaltny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar if indstate==1, vce(robust) eform nolog
*  RESULT
*    --no sign change or change in statistical significance for any variable

* Model 4 excluding observations where polityl equals 0
xtcloglog urbancivicny lnpopl gdppcthl gdppcthl2 polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar if indstate==1 & polityl~=0, vce(robust) eform nolog
*  RESULT
*    --no sign change or change in statistical significance for any variable

* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
* Substitutability of percurbanl, under5mortl, and youthpercl for gdppcthl:
*	each replaces gdppcthl and gdppcthl2 in the pooled Model 4 specification to
*	check that the other variables are unaffected
* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
cloglog urbancivicny lnpopl percurbanl polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar if indstate==1, vce(robust) eform nolog
cloglog urbancivicny lnpopl under5mortl polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar if indstate==1, vce(robust) eform nolog
cloglog urbancivicny lnpopl youthpercl polityl polityl2 yrsincleaderinpower v2x_execorr lnoill postcoldwar if indstate==1, vce(robust) eform nolog
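* Optional sketch (not part of the original checks): pairwise correlations
*   between gdppcthl and each substitute, with significance levels and the
*   number of observations per pair, as a quick view of how closely the
*   substitutes track GDP per capita in the observed data.
pwcorr gdppcthl percurbanl under5mortl youthpercl if indstate==1, sig obs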

log close

* Remove the temporary estimates file written by saving(miest, replace) above
erase miest.ster

clear
